Covering arrays are structures for well-representing extremely large input spaces and are used to efficiently implement blackbox testing for software and hardware. This paper proposes refinements over the In-Parameter-Order strategy (for arbitrary t). When constructing homogeneous-alphabet covering arrays, these refinements reduce runtime in nearly all cases by a factor of more than 5 and in some cases by factors as large as 280. This trend is increasing with the number of columns in the covering array. Moreover, the resulting covering arrays are about 5 % smaller. Consequently, this new algorithm has constructed many covering arrays that are the smallest in the literature. A heuristic variant of the algorithm sometimes produces comparably sized covering arrays while running significantly faster.
1. Introduction

For the purposes of this paper, a $v$-valued $n \times k$ covering array of strength $t$, with integer parameters $n$, $k$, $t$, and $v$, where $v, t \ge 2$, $k \ge t$, and $n \ge v^t$, is a matrix $C$ of size $n \times k$ with entries from $\{0, 1, \ldots, v - 1\}$ which has the property that each submatrix of size $n \times t$ has among its rows all of the $v^t$ possible tuples $(x_1, \ldots, x_t)$ of integers where $0 \le x_i < v$ for each index $i \in \{1, \ldots, t\}$. For the rest of this paper the four parameters will be implicit and we will just refer to such arrays as covering arrays. As the value of v is constant over the columns of the array, this is a homogeneous alphabet covering array. This concept can be generalized to heterogeneous alphabets so that each column, j, has a different $v_j$, but this paper will not discuss such cases because, while all ideas presented apply, the empirical results have not been well explored.

A set A of t indices $j_1, \ldots, j_t$, where $1 \le j_1 < \cdots < j_t \le k$, together with a function $\nu$ on A such that $\nu(j) \in \{0, \ldots, v - 1\}$ for each $j \in A$ will be called a t-tuple, with A referred to as the column tuple and $\nu$ the value tuple. A row $(x_1, \ldots, x_k)$ is said to cover a t-tuple $(A, \nu)$ provided that $x_j = \nu(j)$ for each $j \in A$. Thus C is a covering array if and only if each t-tuple is covered by at least one row in C.
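The definition above can be checked mechanically. The following is a minimal sketch, not taken from the paper: the function name `is_covering_array` and the row-list representation are assumptions made for illustration, and entries are assumed to lie in {0, ..., v - 1}.

```python
from itertools import combinations

def is_covering_array(rows, v, t):
    """Return True if every t-tuple (column tuple, value tuple) is covered.

    Assumes every entry of every row lies in {0, ..., v - 1}, so a column
    tuple is fully covered exactly when v**t distinct value tuples appear.
    """
    k = len(rows[0])
    for cols in combinations(range(k), t):
        # Value tuples that the rows exhibit on this column tuple.
        seen = {tuple(row[c] for c in cols) for row in rows}
        if len(seen) < v ** t:
            return False
    return True

# A binary array with 3 columns and 4 rows, used as a small test case.
example = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]
```

Here `example` has strength 2 but not strength 3, since a strength-3 binary array on three columns needs all 8 rows.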

Software testing is often done with test inputs sampled from a large input space. Taking each row in the covering array as a test from a sample space of size $v^k$ allows covering arrays to identify which test inputs should be used to check software validity. This is desirable because covering arrays well-represent the full sample space by covering all t-tuples. Theoretical results [3] tell us that covering arrays need not require more than approximately $t v^t \log(vk)$ tests, which is drastically less than the full testing that covering arrays approximate. Thus, covering arrays are used as an efficient way for picking tests for software [5,7,9]. For a survey of covering arrays in the binary (v = 2) case, see [6].
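To give a feel for the gap between the approximate bound of about t v^t log(vk) tests and the v^k tests of exhaustive testing, here is a small arithmetic sketch. The helper names are hypothetical, and the use of the natural logarithm is an assumption, since the base of the logarithm is not stated here.

```python
import math

def approx_test_bound(v, t, k):
    """Approximate bound t * v^t * log(v*k); natural log is an assumption."""
    return t * v ** t * math.log(v * k)

def exhaustive_tests(v, k):
    """Size of the full input space: one test per possible row."""
    return v ** k

# Compare the two counts for v = 3, t = 3 and a growing number of columns.
for k in (10, 20, 40):
    print(k, math.ceil(approx_test_bound(3, 3, k)), exhaustive_tests(3, k))
```

The bound grows logarithmically in k while the exhaustive count grows exponentially, which is the point of the comparison.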

Several algorithms for constructing covering arrays 
suitable for software testing have been developed. 
Some use the "In-Parameter-Order," or IPO, strategy 
[7,9]. Here, some refinements of this strategy are pro- 
posed and studied. There are two competing goals for 
algorithms that construct covering arrays: to minimize 
the time required to produce the array, and to minimize 
the number of rows, n, in the array. In this paper we 
present changes to IPO which empirically reduce both 
the execution time and the resulting covering array 
size. 

The original IPO strategy was implemented for 
2-way coverage, that is, for t = 2. However, the principle 
behind the algorithm of treating the columns (parame- 
ters) one by one applies for all t, and the IPO strategy 
is being used as a starting point for the efficient production
of covering arrays for values of t up to 6. This more
general endeavor is designated IPOG ("In-Parameter- 
Order-Generalized"), which is a centerpiece of the 
Automated Combinatorial Testing for Software project 
at the National Institute of Standards and Technology 
(NIST). See [8,10] for current information on IPOG. 
The new strategy incorporates some modifications to 
IPOG which are intended as an aid in constructing cov- 
ering arrays for this project. Some results of a prelimi- 
nary evaluation of these ideas are presented. The tables 
of covering arrays [11] are products of the use of these 
ideas. 



2. The IPO Framework 

We briefly describe the operation of IPO. Unlike 
many other algorithms that build covering arrays one 
row at a time, the IPO strategy builds covering arrays 
one column at a time. Specifically, it uses the idea that 
covering arrays of k - 1 parameters can be used to efficiently
build a covering array of k parameters.
Applying this induction with the trivial base case k= t 
allows for generating any covering array desired. 

To construct the covering array, first make a matrix for the first t parameters which contains each of the possible $v^t$ distinct rows having entries from $\{0, \ldots, v - 1\}$. This matrix will be of size $v^t \times t$. Then, for each additional parameter, perform the following two steps.

• Horizontal growth: Add an additional column 
(corresponding to the new parameter) and fill in its 
values. 

• Vertical growth: For each column tuple, if some value tuple fails to appear, add a new row to cover this t-tuple.

There are many ways to implement the procedures for horizontal and vertical growth. Horizontal growth procedures must address which values are assigned to each entry in the new column. The greedy idea of choosing values that maximize the number of covered t-tuples has been shown to produce fairly small covering arrays in an efficient manner, but there is still the question of which order to fill in the entries in the new column. Two options are presented in [7]: one uses the row order and another tries all possible orders. This paper will explore a third option which greedily picks the row order.

Vertical growth algorithms must decide how to add rows onto a covering array in a way that covers the t-tuples not covered by horizontal growth. One method, as described in [7], is to add rows that are specified with only as much detail as desired, such as to cover a specific t-tuple, while leaving the rest of the row filled with "don't care" values. That is, a don't-care symbol may appear as an entry in the matrix in addition to the integers $0, \ldots, v - 1$, indicating that the value of that entry from $\{0, \ldots, v - 1\}$ has not yet been determined. Using don't-care entries allows the algorithm to defer this determination until more information is present. The don't-care values are replaced with integers either during the same stage of vertical growth or perhaps during another stage of vertical growth after more parameters have been added to the array.

In case t = 2, there is a natural method of implementing the vertical growth procedure, as described in [7]. For each value x from $\{0, \ldots, v - 1\}$ such that there exists a pair of columns (one necessarily the last) not containing all of the v value tuples having x in the last column, add a row to the matrix which has x in the last column and in which each earlier column has a value such that the 2-tuple composed of this value with x has not been covered, or, if there is no such value, has 0. Denoting by $f(x)$ the maximum number of such other values in any other column, we see that at most $\sum_x f(x) \le v^2$ new rows will be added at this stage. Furthermore, it is clear that in any vertical growth algorithm extending the already present rows, at least $\sum_x f(x)$ rows are required. In this sense the procedure is optimal. The procedure, but not the sense of optimality, has been extended in the IPOG algorithm for arbitrary t: by induction of coverage over the columns, we only need to examine t-tuples with $k \in A$, so for each value of $\nu$ we determine whether that t-tuple $(A, \nu)$ is covered in the array. There would be $v^t \binom{k-1}{t-1}$ such t-tuples. Any uncovered t-tuple would be added to the array either by placing it directly in an existing row, replacing don't-care entries with its values, or by appending a row filled with don't-care values to the array and then inserting the t-tuple. This paper primarily focuses on changes to the horizontal growth algorithm, but some minor changes to the vertical growth algorithm are also discussed.



3. The Main Algorithm

3.1 An Explanation of the Algorithm 

The original IPO algorithm is composed of the horizontal growth stage and the vertical growth stage, and ideally we know good algorithms for both stages. However, horizontal growth constitutes most of the runtime and intuitively can be seen to be the critical factor in determining how small the resulting covering array is, as it determines how many t-tuples, and thus rows, vertical growth must place. When t = 2, as we saw above, $\sum_x f(x)$ was the number of rows added by vertical growth, but it was determined by the horizontal, not vertical, growth stage. Thus, this paper seeks to explore the more important of the two stages in IPO by showing how broadening the search space of horizontal growth can increase the optimality of the results and decrease runtime.

The horizontal growth stage takes a covering array of $k - 1$ columns and extends it to an array of k columns by adding one column to the old array, thereby "extending" each row with some value. Any remaining uncovered t-tuples will be covered in the vertical growth stage. The choice of which rows will be extended with which values is the critical step in how any algorithm following this framework operates. The original IPO algorithm examines the rows in order and greedily selects the value to extend each row with. This paper outlines an algorithm that allows for greedy selection over both the row and value with which we extend the array. The metric for the greedy selection is unchanged: we want to pick an extension of the array that covers as many previously uncovered t-tuples as possible.

This generalization offers the possibility of producing smaller covering arrays by virtue of the larger search space for the greedy choice. At first glance it seems unlikely that this approach can achieve a practical runtime. A naive implementation that simply broadens the search space without major algorithmic alterations would incur a large performance cost because each time an extension is performed, all row/value pairs would be checked for the additional coverage they offer. Each extension may affect the coverage of $\binom{k-1}{t-1}$ t-tuples, and under a naive approach all must be checked. This implementation would thus have a runtime of $O(v r^2 \binom{k-1}{t-1})$, as opposed to the better runtime of the original strategy, which is $O(v r \binom{k-1}{t-1})$.

Fortunately the naive implementation performs 
more work than necessary and can be improved. In the 
expanded search space, any algorithm must examine 
row/value pairs for their possible additional coverage 
multiple times, but a naive approach simply performs 
the calculation again. By using dynamic programming 
to store and update this information appropriately this 
larger search space can be explored much faster. 

Consider a non-extended row. Extending it with some value will cover $\binom{k-1}{t-1}$ t-tuples, some of which may have already been covered and others which would be newly covered. We can express this relationship with

$$t_n + t_c = \binom{k-1}{t-1} \qquad (1)$$

with $t_c$ denoting the number of "covered" t-tuples that have previously been covered by already extended rows and $t_n$ denoting the number of new t-tuples the row/value pair would cover if we choose that extension. As we want to maximize additional coverage, $t_n$ is the metric used to gauge which row/value pairs to extend the array with. The value of $t_n$ will not increase as we extend more rows in the array. The naive implementation directly maintains $t_n$. However, if we use (1) with dynamic programming and maintain $t_c$ directly (and thus $t_n$ indirectly) we get the same greedy metric but, as will be shown, in a more efficient manner.
To maintain $t_c$ we have two arrays, $T_c[r, v]$ and $Cov[A, \nu]$. $T_c$ is indexed by a row and a value and stores $t_c$ for this row/value pair. This takes $\Theta(rv)$ space. $Cov[A, \nu]$ is a boolean array indexed by the column tuple A and value tuple $\nu$, and its entries indicate whether the t-tuple $(A, \nu)$ is covered. The IPO framework guarantees that the first $k - 1$ columns form a covering array so we only need to consider the t-tuples that have the last column in their column tuple, A; therefore the array Cov takes $O(v^t \binom{k-1}{t-1})$ space. Initially $T_c$ is filled with zeros and Cov is filled with false's. When the greedy selection occurs and a row is extended with a certain value, both of the arrays must be updated.

We take a brief moment to discuss how the arrays are indexed. $T_c$ is straightforward, but Cov is more complex as A and $\nu$ need an efficient scheme to be represented as numbers for Cov to be a conventional array. To do this, we note that combinations can be lexicographically ordered and the position of a combination A can be used as its numerical hash for accessing the array; see Knuth's Theorem L [4]. As we have a homogeneous alphabet covering array we treat $\nu$ as a number with base v.
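One way to realize such an indexing scheme is sketched below. It ranks the column tuple with the combinatorial number system and reads the value tuple as a base-v number. The function names are hypothetical, and the particular ordering of combinations used here is an assumption; it may differ from the ordering used in the authors' implementation, but any bijection onto a contiguous range would serve the same purpose.

```python
from math import comb

def rank_combination(cols):
    """Rank a strictly increasing tuple of 0-based column indices.

    Uses the combinatorial number system: sum of C(c_i, i) for i = 1..m.
    Distinct m-subsets of {0, ..., n-1} map to distinct ranks in
    {0, ..., C(n, m) - 1}.
    """
    return sum(comb(c, i) for i, c in enumerate(cols, start=1))

def encode_values(values, v):
    """Read the value tuple as a number written in base v."""
    code = 0
    for x in values:
        code = code * v + x
    return code

def cov_index(cols, values, v):
    """Flat index into Cov for one t-tuple (illustrative layout).

    cols   -- the t-1 columns of the column tuple other than the last column
    values -- all t entries of the value tuple (last column's value last)
    """
    return rank_combination(cols) * v ** len(values) + encode_values(values, v)
```

Because the last column is always part of the column tuple, only the other t - 1 columns need to be ranked, so the flat array has $v^t \binom{k-1}{t-1}$ slots.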

Consider when we just extended the row $i \in \{1, \ldots, n\}$ with the value $a \in \{0, \ldots, v - 1\}$, which we denote as having performed the row extension (i, a). We then need to update the coverage count $T_c[j, b]$ for all other extensions (j, b), as well as which t-tuples have been covered as reflected by the Cov array.

$T_c[j, b]$ already has the "covered" t-tuple count for (j, b) prior to extending the array with the pair (i, a). We need to count, among the $\binom{k-1}{t-1}$ t-tuples the extension (j, b) covers, which ones are "newly" covered by (i, a) and thus no longer contribute to the $t_n$ value for (j, b). A naive way to update $T_c$ would be to check all $\binom{k-1}{t-1}$ t-tuples. However, this offers no time savings over the naive implementation discussed earlier.

To achieve time savings, first notice that when we extended the array with (i, a), the only extensions (j, b) that can have their $t_n$ value change are those for which b = a. This is again from the inductive fact that the first $k - 1$ columns form a covering array so we are only considering t-tuples whose column tuple A includes the last column. So if $b \ne a$, then the extension (j, b) cannot have any of its $\binom{k-1}{t-1}$ t-tuples covered by the extension (i, a), so we need not update $T_c[j, b]$. Thus, we only need to consider row/value pairs (j, a) when updating $T_c$.

Second, instead of examining all $\binom{k-1}{t-1}$ t-tuples, we can restrict ourselves to a smaller set. If a t-tuple $(A, \nu)$ was freshly covered by (i, a) and would also be covered by the possible row extension (j, a) then this means that for each $l \in A$, $\nu(l)$ is the entry in both positions (i, l) and (j, l) in the array. Therefore, the freshly covered t-tuples in row j are the t-tuples with an A that is a subset of the columns where row i and row j have identical entries.

This observation gives us the procedure we desire. Begin by examining the columns where the newly-extended row i and a row j have identical entries. Any freshly covered t-tuple in row j must have its column tuple entirely within these "shared columns." If there are s shared columns, then we only need explore $\binom{s}{t-1}$ values for A, as the last column must be in A. Notice that $\nu$ is completely specified by A and the two rows we are comparing. For each t-tuple "shared" between the two rows, we check $Cov[A, \nu]$ to see whether the t-tuple was covered previously. If so, this t-tuple doesn't affect $T_c[j, a]$. Otherwise, $T_c[j, a]$ is increased by one. These steps keep $T_c$ updated.

After updating $T_c$ for each non-extended row/value pair, Cov is updated by marking all $\binom{k-1}{t-1}$ t-tuples covered by the extension (i, a) as "covered" if they were not so already.

With the update step for $T_c$ and Cov explained, the whole algorithm can be discussed. First, all non-extended rows are searched, calculating the $t_n$ values for each row/value pair from the $T_c$ array. A row/value pair is chosen greedily, with ties broken randomly, and that extension is performed. The update step then occurs and the process is repeated until either there are no rows to extend or no additional coverage would result from further extensions.

There are several nuances that still need to be discussed. Of principal concern is the search for the maximum $t_n$ value. Searching through all non-extended rows for the maximum $t_n$ value is wasteful. Recall that extending one row with a value a can only affect other row/value pairs that would also extend a row with the value a. Thus, the $t_n$ values only change on roughly $1/v$ of the iterations. Further, the $t_n$ values do not increase, so an intelligent data structure, such as a priority queue, could be used to exploit such properties in a highly efficient manner. However, the large number of row/value pairs and the fact that we only extend with one pair for each row make it seem wasteful in both time and space to maintain this data structure. Instead, a list of row/value pairs with the maximum $t_n$ value is maintained. Since the $t_n$ values do not increase, the list can never get bigger until it completely runs out. Thus, instead of searching all row/value pairs, we simply pick a random candidate off the list and then prune the list of those row/value pairs which had their $t_n$ value decrease as a result of the update to $T_c$. When the list runs out, we search all of the row/value pairs to find those that attain the maximum $t_n$ value. Experimental evidence suggests that for values of v of at least 10, the list runs out infrequently. For small values of v the list's size of $O(rv)$ seems small enough to be manageable. Thus a better data structure seems unwarranted overall.

Another nuance is the presence of the "don't care" values. These entries are unspecified so far because no additional coverage for the subarray of $k - 1$ parameters could be gained by specifying their values. They clearly have potential for additional coverage for the k parameters. To incorporate this potential into the $t_n$ values, one possible method would be to treat the don't-care values in a way that assumes they maximize their potential. This seems difficult to implement in an efficient manner, and in particular, does not seem to fit in well with (1), the equation driving this entire approach. Instead, the choice was made that the potential of don't-care values will be ignored during horizontal growth and, if possible, don't-care values will be replaced during vertical growth. To achieve this, we restate the relation as $t_n + t_c = \binom{g}{t-1}$, where g is the number of "good", or specified, columns in the row (excluding the last column).

The vertical growth algorithm is virtually identical to the original idea in IPOG. However, since some rows may be non-extended in horizontal growth, some slight modifications have been made. When a t-tuple needs to be covered in vertical growth, all rows are searched for a suitable position and the first match is taken. Don't-care values are filled in accordingly. To save time, the search is started in the first row where there are don't-care values because some part of the row must be unspecified for a t-tuple to be placed there.

3.2 Pseudocode for the Algorithm 

Algorithm 1 presents pseudocode for horizontal 
growth. For simplicity, the pseudocode does not imple- 
ment the list of candidate row/value pairs and does not 
address the don't-care values. 



Algorithm 1 Horizontal Growth

$T_c[i, a] \leftarrow 0$, $\forall i, a$
$Cov[A, \nu] \leftarrow$ false, $\forall A, \nu$
while some row is non-extended do
    Find non-extended row i and value a so that $t_n = \binom{k-1}{t-1} - T_c[i, a]$ is maximum
    if $t_n = 0$ then
        stop horizontal growth
    end if
    Extend row i with value a
    for all non-extended rows j do
        $S \leftarrow$ set of columns where rows i and j have identical entries
        for all column tuples $A \subseteq S$ do
            $\nu \leftarrow$ the value tuple in row i and column tuple A
            if $Cov[A, \nu]$ = false then
                $T_c[j, a] \leftarrow T_c[j, a] + 1$
            end if
        end for
    end for
    for all column tuples A do
        $\nu \leftarrow$ the value tuple in row i and column tuple A
        if $Cov[A, \nu]$ = false then
            $Cov[A, \nu] \leftarrow$ true
        end if
    end for
end while
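The horizontal growth pseudocode can be turned into a short runnable sketch. This is an illustration under several assumptions rather than the authors' implementation: Cov is a Python set instead of a flat boolean array, ties are broken deterministically (first candidate found) instead of randomly, every existing entry is assumed to be specified (no don't-care values), and the function name `horizontal_growth` is hypothetical.

```python
from itertools import combinations
from math import comb

def horizontal_growth(rows, v, t):
    """Greedily choose a new-column value for rows of a (k-1)-column array.

    Returns (new_col, cov): new_col[i] is the value chosen for row i, or
    None if row i was left non-extended; cov holds the covered t-tuples
    that involve the new column, keyed as (columns, values, new value).
    """
    r, k1 = len(rows), len(rows[0])      # k1 = k - 1 existing columns
    total = comb(k1, t - 1)              # t-tuples touched by one extension
    t_c = [[0] * v for _ in range(r)]    # T_c[i][a], already-covered counts
    cov = set()
    new_col = [None] * r
    pending = set(range(r))              # non-extended rows
    while pending:
        # Maximizing t_n = total - T_c is the same as minimizing T_c.
        i, a = min(((i, a) for i in sorted(pending) for a in range(v)),
                   key=lambda p: t_c[p[0]][p[1]])
        if total - t_c[i][a] == 0:
            break                        # no extension adds coverage
        new_col[i] = a
        pending.discard(i)
        for j in pending:
            # Only tuples inside the shared columns can be freshly covered.
            shared = [c for c in range(k1) if rows[i][c] == rows[j][c]]
            for cols in combinations(shared, t - 1):
                key = (cols, tuple(rows[i][c] for c in cols), a)
                if key not in cov:
                    t_c[j][a] += 1
        # Mark everything the extension (i, a) covers.
        for cols in combinations(range(k1), t - 1):
            cov.add((cols, tuple(rows[i][c] for c in cols), a))
    return new_col, cov
```

For example, starting from the four rows of the full binary array on two columns with t = 2, this sketch extends every row and covers all eight pairs that involve the new column, so no vertical growth is needed in that case.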



Algorithm 2 presents pseudocode for vertical growth. With this pseudocode we can discuss further implementation details. Examine $T_A$. One could simply use the array Cov from horizontal growth to determine which t-tuples are uncovered. However, this approach only works for t = 2, as for general t, when a t-tuple is placed in the array it could also create new coverage that was unintended. It is important to capture the unintended coverage, so we cannot just use Cov. Instead we calculate $T_A$ for each A, and doing so fully captures all coverage. A naive implementation for calculating $T_A$ would require searching all rows, calculating the value of $\nu$ specified by that row and the given value of A, and removing the t-tuple $(A, \nu)$ from a list of t-tuples to cover. However, we get this information faster by traversing the column tuples A in a structured way. Using recursion, we can traverse the column tuples in a lexicographic order which infrequently changes many of the columns in A. Thus, for a given row, the numerical hash of $\nu$ that results also changes in a structured way, as the entries in the value tuple also infrequently change. This clearly can be exploited to achieve time savings but the details will not be presented.
Algorithm 2 Vertical Growth

for all column tuples A : $k \in A$ do
    $T_A \leftarrow$ list of uncovered t-tuples with this A
    for all $\nu$ : $(A, \nu) \in T_A$ do
        for all rows i with a don't-care entry with a column in A do
            if we can place $(A, \nu)$ then
                place $(A, \nu)$
            end if
        end for
        if $(A, \nu)$ not placed yet then
            add a new row with $(A, \nu)$ as the only entries
        end if
    end for
end for
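A naive rendering of the vertical growth pseudocode is sketched below. It assumes don't-care entries are represented by `None`, recomputes the uncovered tuples for each column tuple by scanning all rows (the slow approach the text describes, not the structured traversal), and takes the first row in which the tuple fits. The function name and representation are hypothetical.

```python
from itertools import combinations, product

DONT_CARE = None  # stand-in for the don't-care symbol (an assumption)

def vertical_growth(rows, v, t):
    """Cover every t-tuple whose column tuple contains the last column.

    rows is a list of lists, modified in place; entries are ints in
    {0, ..., v - 1} or DONT_CARE. Returns the same list for convenience.
    """
    k = len(rows[0])
    last = k - 1
    for others in combinations(range(last), t - 1):
        cols = others + (last,)
        # Recomputed per column tuple so that coverage created as a side
        # effect of earlier placements is taken into account.
        covered = {tuple(row[c] for c in cols) for row in rows}
        for values in product(range(v), repeat=t):
            if values in covered:
                continue
            for row in rows:
                # The tuple fits if each entry is unspecified or agrees.
                if all(row[c] in (DONT_CARE, x) for c, x in zip(cols, values)):
                    for c, x in zip(cols, values):
                        row[c] = x
                    break
            else:
                # No suitable row: append one that is otherwise don't-care.
                new_row = [DONT_CARE] * k
                for c, x in zip(cols, values):
                    new_row[c] = x
                rows.append(new_row)
    return rows
```

Any don't-care entries that remain at the end can be left for a later stage or filled with arbitrary values without losing coverage.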



3.3 Analysis 

The space complexity of this algorithm was discussed above, with the requirements mainly driven by the two arrays $T_c$ and Cov, requiring $O(rv)$ and $O(v^t \binom{k-1}{t-1})$ space respectively. We now turn to the time analysis.

The performance of the algorithm in practice has 
proven to be very competitive as will be discussed in 
Sec. 5. Theoretically the properties of this algorithm are 
less clear. The complexity of the algorithm suggests no 
optimality guarantee is possible and unfortunately even 
a rigorous time bound is elusive. While the algorithm is 
not randomized (except for minor portions), the notion 
of "average" is needed here since what the worst-case 
input would be is a property of covering arrays that is 
unknown. Thus, we explore the runtime using some 
heuristic arguments. 

Let's first look at one stage of the horizontal growth algorithm. For each non-extended row, we first must search in $O(rv)$ time for the extension (i, a) to perform. The candidate list has not been thoroughly explored to give a better guarantee. Then, for each non-extended row, j, we must explore the $\binom{s}{t-1}$ t-tuples needed to update $T_c[j, a]$. Calculating the $\nu$ required to index Cov seems like it might take $O(t)$ time, but the lexicographic ordering mentioned earlier allows this to be done in amortized $O(1)$ time, so exploring the t-tuples takes $O(\binom{s}{t-1})$ time. We then iterate through the $\binom{k-1}{t-1}$ t-tuples (i, a) covers in order to update Cov. This gives the time guarantee of $O\left(\sum_{i=1}^{r}\left(rv + \binom{k-1}{t-1} + \sum_{j=1}^{r}\binom{s}{t-1}\right)\right) = O\left(r^2 v + r\binom{k-1}{t-1} + r^2 S\right) = O\left(r\binom{k-1}{t-1} + r^2 S\right)$, with S the average $\binom{s}{t-1}$ value. A rough guess at the value of S would take the fact that $s \le k - 1$ to have $S \le \binom{k-1}{t-1}$. This would give $O\left(r\binom{k-1}{t-1} + r^2\binom{k-1}{t-1}\right)$. But this is largely unsatisfactory as the original IPO framework takes $O(rv\binom{k-1}{t-1})$ time. This shows that the algorithm suffers from a large drawback because of this $r^2$ term, which raises the question of whether a better bound for S can be found.

A better analysis returns to the key step of the algorithm where we update $T_c$ for each non-extended row, j. The number of t-tuples that are examined in this step is $\binom{s}{t-1}$, but this is clearly dependent on the row that was last extended, i, and the row being updated, j, as s is the number of columns that have identical entries. With this step dominating much of the runtime of the algorithm it is critical to do a thorough analysis, and yet because of the extreme uncertainty in this property of covering arrays, it seems unlikely any rigorous argument can be made other than $\binom{s}{t-1} \le \binom{k-2}{t-1}$. Here $k - 2$ is used because by the inductive step we know the first $k - 1$ columns are a covering array made by this algorithm, hence no two rows are exactly the same. However, the small sizes of the resulting covering arrays suggest that the rows must be highly dissimilar, so we suspect on average that $\binom{s}{t-1} \ll \binom{k-2}{t-1}$.

To deal with this, it seems reasonable to consider what would be the case if the first $k - 1$ columns were not a covering array, but instead were a random array. With this, we can take $E\left[\binom{s}{t-1}\right]$ as an approximation of the number of t-tuples that will be explored in this part of the algorithm. While it should be clear that the algorithm will examine many fewer t-tuples than this because we are dealing with covering arrays, no rigorous argument for this fact is made. Using indicator random variables we get $E\left[\binom{s}{t-1}\right] = \frac{1}{v^{t-1}}\binom{k-1}{t-1}$, which is much less than the worst case bound of $\binom{k-2}{t-1}$ in most cases. Using $S \approx \frac{1}{v^{t-1}}\binom{k-1}{t-1}$ we get the heuristic bound of $O\left(r\binom{k-1}{t-1} + \frac{r^2}{v^{t-1}}\binom{k-1}{t-1}\right) = O\left(\frac{r^2}{v^{t-1}}\binom{k-1}{t-1}\right)$.
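The indicator-variable step can be spelled out as follows. This is a sketch under the random-array assumption only, namely that the entries of the first k - 1 columns are independent and uniform on {0, ..., v - 1}; the symbols B and X_B are introduced here for illustration.

```latex
% For two fixed rows i and j and each (t-1)-subset B of the first k-1 columns,
% let X_B = 1 if rows i and j agree on every column of B, and X_B = 0 otherwise.
\[
  \Pr[X_B = 1] = \left(\frac{1}{v}\right)^{t-1},
  \qquad
  \binom{s}{t-1} = \sum_{B} X_B ,
\]
% since the (t-1)-subsets of the s shared columns are exactly the subsets B
% on which the two rows agree. By linearity of expectation,
\[
  E\!\left[\binom{s}{t-1}\right]
  = \sum_{B} \Pr[X_B = 1]
  = \frac{1}{v^{t-1}} \binom{k-1}{t-1}.
\]
```

Linearity of expectation does not require the indicators to be independent of one another, which is why overlapping subsets B cause no difficulty.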

For one stage of the vertical growth algorithm there is the clear worst case bound of $O(r v^t \binom{k-1}{t-1})$, which relies on the fact that we are only checking values of A that contain the last column. Realistically, the $v^t$ factor is an extremely weak upper bound as most t-tuples for any specific column tuple will be covered by the horizontal growth stage and thus no searching throughout the rows is performed. However, this argument doesn't seem to lend itself to a better analysis.

To take a covering array of $k - 1$ parameters to k parameters we must combine both stages of the algorithm for a combined time bound of $O\left(\frac{r^2}{v^{t-1}}\binom{k-1}{t-1} + r v^t \binom{k-1}{t-1}\right)$. To get a total time bound for the algorithm we must generate the covering array from t columns up to the final k. Taking r as the final number of rows involved in the algorithm, this gives $O\left(\frac{r^2}{v^{t-1}}\binom{k}{t} + r v^t \binom{k}{t}\right)$.

We can take the heuristic argument one step further with the assumption, as suggested by experimental evidence in Fig. 3 and Fig. 6, that the resulting covering array size meets a logarithmic bound, in particular the bound $r \le v^t \log\left(v^t \binom{k}{t}\right)$. The approximate time bound becomes $O\left(v^{t+1} \log^2\left(v^t \binom{k}{t}\right)\binom{k}{t} + v^{2t} \log\left(v^t \binom{k}{t}\right)\binom{k}{t}\right)$. While this bound does not itself suggest the algorithm is fast, experimental evidence to be presented in Sec. 5.2 does.



4. Heuristics

4.1 Modifying Horizontal Growth 

One concern with computational approaches to cov- 
ering array construction is their time-intensive nature. 
In this section we give a modification of the presented 
horizontal growth algorithm that aims to heavily reduce 
the time required while still producing decently sized 
covering arrays for smaller values of v, such as those 
less than ten. 

To achieve this claim, we look at the step where $T_c[j, a]$ is updated and recall how it is this step that dominates much of the time in the algorithm. We had to examine $\binom{s}{t-1}$ t-tuples to see if they were already covered, and if not, we performed the operation of setting $T_c[j, a] \leftarrow T_c[j, a] + 1$. It is easy to see then that $T_c[j, a]$ is incremented overall by no more than $\binom{s}{t-1}$ and does not decrease. By using this information in an intelligent way we can avoid searching through any t-tuples at all.

The idea is to take $T_c[j, a] \leftarrow T_c[j, a] + f(n, s, t - 1)$ with n as the number of already extended rows and $f(n, s, t - 1)$ as a function that guesses how much $T_c[j, a]$ should increase without performing any searching at all. A prime candidate for $f(n, s, t - 1)$ would be returning to the idea of a random array and using $E\left[\binom{s}{t-1}\right]$; however, this did not yield competitive covering array sizes. For reasons as yet unexplained, simply taking $f(n, s, t - 1) = \binom{s}{t-1}$, that is, assuming that all t-tuples shared between rows i and j were previously uncovered, yields competitive covering array sizes with a drastic reduction in time. Algorithm 3 gives the pseudocode for this approach.



Algorithm 3 Heuristic Horizontal Growth

$T_c[i, a] \leftarrow 0$, $\forall i, a$
while some row is non-extended do
    Find non-extended row i and value a so that $t_n = \binom{k-1}{t-1} - T_c[i, a]$ is maximum
    if $t_n = 0$ then
        stop horizontal growth
    end if
    Extend row i with value a
    for all non-extended rows j do
        $S \leftarrow$ set of columns where rows i and j have identical entries
        $T_c[j, a] \leftarrow T_c[j, a] + \binom{|S|}{t-1}$
    end for
end while
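The heuristic variant is short enough to sketch directly. As before, this is an illustration with a hypothetical function name, deterministic tie-breaking, and no don't-care handling. One further assumption is labeled in the code: because the heuristic can overestimate $T_c$, the stopping test here uses "at most zero" rather than "equal to zero".

```python
from math import comb

def heuristic_horizontal_growth(rows, v, t):
    """Heuristic horizontal growth: no Cov array and no t-tuple search.

    Returns new_col, where new_col[i] is the value chosen for row i, or
    None if the row was left non-extended.
    """
    r, k1 = len(rows), len(rows[0])      # k1 = k - 1 existing columns
    total = comb(k1, t - 1)
    t_c = [[0] * v for _ in range(r)]    # estimated covered counts
    new_col = [None] * r
    pending = set(range(r))
    while pending:
        i, a = min(((i, a) for i in sorted(pending) for a in range(v)),
                   key=lambda p: t_c[p[0]][p[1]])
        # Assumption: the estimate may exceed `total`, so stop on <= 0.
        if total - t_c[i][a] <= 0:
            break
        new_col[i] = a
        pending.discard(i)
        for j in pending:
            # s = number of columns where rows i and j agree.
            s = sum(1 for c in range(k1) if rows[i][c] == rows[j][c])
            # Pretend every shared tuple was previously uncovered.
            t_c[j][a] += comb(s, t - 1)
    return new_col
```

On the full binary array on two columns with t = 2 this sketch makes the same choices as the exact bookkeeping would, although in general the two can diverge.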



4.2 Analysis 

Notice that because we no longer search through any t-tuples, we no longer need the Cov array, so our space complexity is drastically reduced to just $O(rv)$.

While the space savings are good, the particular point that makes this approach worthwhile is the speed gains. In particular, this sets $S = 0$ in the time analysis for the original algorithm. This allows a rigorous worst-case bound to be set forth of $O\left(r^2 v + r\binom{k-1}{t-1}\right)$ for one stage of horizontal growth. Combining this with the vertical growth stage over all iterations of the algorithm we get the bound $O\left(r^2 v k + r\binom{k}{t} + r v^t \binom{k}{t}\right)$. Approximations for r are less valid here because for larger values of v, such as those larger than 10, the heuristic introduces too much error into the process. However, aside from those values, this method clearly demonstrates a fast heuristic that, as shown in Sec. 5.3, can also produce decently sized covering arrays.



5. Implementation and Empirical Results

5.1 Implementation 

IPOG is currently implemented in a software package called FireEye [8], which was written in Java. The algorithms outlined in this paper were also implemented in Java as IPO' and IPO", with IPO" the version with the heuristic horizontal growth algorithm. These implementations are currently incorporated into FireEye as IPOG-F and IPOG-F2, respectively. Both versions of horizontal growth use randomization to break ties in the greedy selection. While this seems undesirable because it lacks the repeatability of IPOG, which is deterministic, only minor differences in time and covering array size have been observed. However, this can still be important in a few notable cases. For example, with t = 5, v = 2, and k = 13, the smallest known size was previously 104 [2] but by running IPO' many times and taking the minimum size, the new bound of 103 was generated. A more typical result would be around 112. This is not a drastic gain so, if needed, one could probably fix the seed for the pseudorandom number generator and still be very confident in behavior matching what is described in this paper.

As noted earlier, this paper will only talk about 
homogeneous alphabet covering arrays. The ideas scale 
to heterogeneous alphabets and the two implementa- 
tions IPO' and IPO" can deal with these situations. 
However, the performance gains described in this paper 
do not seem to extend to this more general setting: while
competitive results are seen, IPOG seems to do
better.

5.2 Comparison to FireEye 

In this section graphs are presented to compare
FireEye running the IPOG algorithm and IPO'. All runs
were performed on a 2.6 GHz AMD Opteron machine with
4 GB of RAM allocated to the programs. The following
graphs compare FireEye and IPO' only for the case t = 3
and v = 3, but these results are representative of the
many runs observed for small values of t and v.

Figure 1 shows the amount of time IPO' takes to
generate each covering array. This time is rather modest.

Fig. 1. Execution time for t = 3, v = 3.



Figure 2 takes the IPO' time as a normalizing factor for
the time FireEye spends on the same instance, which
gives a speedup ratio for IPO'. This ratio is greater
than 1 except when k = 4 and greater than 2 for k > 8.
The ratio is large even for modest values of k and is
clearly increasing with k. The largest speedup ratio in
this graph, 281, clearly shows the advantages of IPO'.

Fig. 2. Time comparisons for t = 3, v = 3.

Figure 3 shows the sizes of the resulting covering
arrays from IPOG and IPO'. It is important to note that
IPO' appears to maintain a covering array about 5 %
smaller than that of FireEye. This 5 % is important
because it allows IPO' to produce the smallest covering
arrays in the literature [2] for k > 208. This fits with
the intuition that by searching a larger space, IPO' can
achieve a better result. These graphs show that IPO' has
both a time and a size advantage for t = 3 and v = 3.
Similar results have been seen in all situations
observed thus far.

IPOG, and its implementation FireEye, have already
been shown to be competitive in both time and size
comparisons with other algorithms, so these results
suggesting that IPO' performs better than FireEye speak
well of its performance in general. We also compare
FireEye and IPO' to the Deterministic Density Algorithm
(DDA) [1]. For t = 2, v = 4, and k = 100, IPO' gave a
covering array of size 53 in 0.6 seconds. FireEye gave a
covering array of size 54 in 2.3 seconds. DDA is
reported to give an array of size 51 in 24.9 seconds.
While IPO' may not give the best array size, the time
savings are significant.

Fig. 3. Size comparisons for t = 3, v = 3.
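
The speedup ratio used in Fig. 2 is simply the baseline
time divided by the IPO' time. As a small worked example,
the sketch below applies that ratio to the three timings
quoted above for t = 2, v = 4, and k = 100. The class and
method names are hypothetical, and the DDA time is a reported
figure rather than one measured in these runs, so that ratio
is only indicative.

```java
// Hypothetical helper for illustration only.
final class SpeedupRatio {

    /** Baseline time divided by candidate time; values above 1 favor the candidate. */
    static double speedup(double baselineSeconds, double candidateSeconds) {
        if (candidateSeconds <= 0.0) {
            throw new IllegalArgumentException("candidate time must be positive");
        }
        return baselineSeconds / candidateSeconds;
    }

    public static void main(String[] args) {
        double ipoPrime = 0.6;   // IPO' time from the text, in seconds
        double fireEye = 2.3;    // FireEye (IPOG) time from the text
        double dda = 24.9;       // reported DDA time from the text

        System.out.println(String.format("FireEye / IPO': %.1f", speedup(fireEye, ipoPrime)));
        System.out.println(String.format("DDA / IPO':     %.1f", speedup(dda, ipoPrime)));
    }
}
```

On these numbers IPO' is roughly 4 times faster than FireEye
and roughly 40 times faster than the reported DDA time, at a
cost of two extra rows relative to DDA.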



5.3 Heuristic Performance 

It has been shown that IPO' is very efficient, but for
extremely large covering arrays, such as when t = 6, it
still requires more time than might be feasible. The
heuristic from above was shown theoretically to be much
faster than the original idea. By implementing the
heuristic as IPO", we have observed that it is
competitive in size for small values of v. We
demonstrate this in the case t = 6 and v = 2. Figure 4
shows the amount of time IPO" takes as a function of k.
In Fig. 5 we show the time IPO' took normalized by the
execution time of IPO". These two graphs show that the
heuristic offers major time savings. While the time
savings are important, it is also key that this gain
does not drastically increase the size of the resulting
covering array. In Fig. 6, the array sizes are shown. In
this case the covering array resulting from the
heuristic is not significantly larger than the array
produced by the original idea. This suggests that
the guess of taking T^[j\ a] ^ T^[j\ a] + {^1^ is a good 
one and not too far from the actual value. It is worth
noting that for larger values of v, the error introduced
by our guess was too large to be practical and can even
make IPO" run slower than IPO'. It should also be noted
that for 38 < k < 80 the covering arrays generated by
IPO' are the smallest known, as compared to [2].

Fig. 4. Execution time for t = 6, v = 2.

Fig. 6. Size comparisons for t = 6, v = 2.



5.4 Covering Array Numbers 

IPO' has been run for many small values of v and t
and for k as large as possible. In many situations, for
large k, IPO' has created the smallest known covering
array sizes, some of which have been noted already. In
Fig. 7, the results for IPO' are compared with the best
known numbers [2] in the case t = 4 and v = 3. Notice
that for values 52 < k < 500 (and for some k < 52), IPO'
gives the best known covering array size seen in the
literature and, further, that the covering array size
appears fairly linear in this log-plot, suggesting that
this algorithm does very well asymptotically. This
entire data set took three weeks to generate but reached
k = 52 in 31 seconds. All of these covering arrays were
saved and are available upon request.

Fig. 7. Covering array numbers for t = 4, v = 3.
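
A reader who obtains one of these arrays may want to confirm
that it really has the stated strength. The sketch below
checks the definition from Sec. 1 directly: for every choice
of t columns, all v^t tuples must appear among the rows. It
is a brute-force check written for this illustration (the
class CoverageCheck and the method isCoveringArray are
hypothetical names, not part of FireEye), and it assumes that
all entries lie in {0, ..., v - 1} and that v^t fits in an
int.

```java
// Hypothetical brute-force verifier, for illustration only.
final class CoverageCheck {

    /** True if every choice of t columns shows all v^t value tuples among the rows. */
    static boolean isCoveringArray(int[][] rows, int v, int t) {
        int k = rows.length == 0 ? 0 : rows[0].length;
        if (t < 1 || t > k) {
            return false;
        }
        int tuples = 1;
        for (int i = 0; i < t; i++) {
            tuples *= v;                       // v^t, assumed to fit in an int
        }
        int[] cols = new int[t];
        for (int i = 0; i < t; i++) {
            cols[i] = i;                       // first t-subset of columns
        }
        while (true) {
            boolean[] seen = new boolean[tuples];
            int count = 0;
            for (int[] row : rows) {
                int code = 0;
                for (int c : cols) {
                    code = code * v + row[c];  // encode the tuple as a base-v number
                }
                if (!seen[code]) {
                    seen[code] = true;
                    count++;
                }
            }
            if (count < tuples) {
                return false;                  // some tuple is missing for these columns
            }
            // Advance to the next t-subset in lexicographic order.
            int i = t - 1;
            while (i >= 0 && cols[i] == k - t + i) {
                i--;
            }
            if (i < 0) {
                return true;                   // all column subsets checked
            }
            cols[i]++;
            for (int j = i + 1; j < t; j++) {
                cols[j] = cols[j - 1] + 1;
            }
        }
    }

    public static void main(String[] args) {
        // A small binary array of strength 2 on 3 columns.
        int[][] ca = {{0, 0, 0}, {0, 1, 1}, {1, 0, 1}, {1, 1, 0}};
        System.out.println(isCoveringArray(ca, 2, 2));
    }
}
```

The check visits every t-subset of the k columns, so its cost
grows quickly with k and t; it is meant for spot checks of
modest arrays rather than for the largest cases reported
here.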
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